<html>
<body style="padding:20px;font-family:arial;">

<div style="width:500px;">

<h2>Sequence Project Parameters</h2>
<p>
This page provides all the setup parameters for a sequence project.
<p>
Most important are the following:
<p>
<b>sequence_files</b>
<p>
Select the fasta-formatted sequence files, or directories
of sequence files, to upload to the project. There must be at least
one sequence file. Note that a file can contain multiple sequences. 
<p>

<b>anno_files</b>
<p>
Select GFF3-formatted annotation files corresponding to your sequences. SyMAP currently
recognizes annotations of type <tt>gene</tt>, <tt>exon</tt>, <tt>centromere</tt>, <tt>gap</tt>,
and <tt>CDS</tt> (treated the same as <tt>exon</tt>). The type is read from column 3 of
the GFF file. Annotation is optional but highly recommended.
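<p>
The type check can be illustrated with a minimal Python sketch. It assumes tab-separated, 9-column GFF3
records and an exact-case match on the type; the function name <tt>recognized_type</tt> is hypothetical,
and this is not SyMAP's own parser.

```python
# Annotation types named in the text above; CDS is treated the same as exon.
RECOGNIZED = {"gene", "exon", "centromere", "gap", "CDS"}

def recognized_type(gff_line):
    """Return the normalized feature type of a GFF3 line, or None if it is ignored."""
    if not gff_line.strip() or gff_line.startswith("#"):
        return None  # blank line, comment, or directive
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None  # not a well-formed 9-column GFF3 record
    ftype = fields[2]  # column 3 holds the feature type
    if ftype not in RECOGNIZED:
        return None
    return "exon" if ftype == "CDS" else ftype
```

<p>
Running a check like this over an annotation file before upload can show how many records have a type
outside the recognized list.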
<p>
<b>grp_prefix</b>
<p>
If provided, this allows SyMAP to remove the prefix from the chromosome or contig 
names and use the remaining part as a shorter name (e.g. "100" instead of "contig_100"),
saving space in the display. 
Sequences that do not match the prefix keep their full names.
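<p>
The prefix rule can be sketched in Python as follows. The function name <tt>short_name</tt> is hypothetical,
and two details are assumptions rather than documented SyMAP behavior: the match is case-sensitive, and a
name that equals the prefix exactly is left unchanged.

```python
def short_name(full_name, grp_prefix):
    """Strip grp_prefix from a sequence name when it matches; otherwise keep the full name."""
    if grp_prefix and full_name.startswith(grp_prefix):
        remainder = full_name[len(grp_prefix):]
        # Assumption: never return an empty display name.
        return remainder if remainder else full_name
    return full_name
```

<p>
With <tt>grp_prefix</tt> set to <tt>contig_</tt>, the name <tt>contig_100</tt> becomes <tt>100</tt>, while
<tt>scaffold_7</tt> is left as is.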
<p>
<b>grp_sort</b>
<p>
How your sequences should be ordered: numerically, alphabetically, or in the order
found in the fasta file (this only works if they are all in one file). Note that you
can change the ordering from "file" to one of the others without reloading the files.
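<p>
The difference between the three orderings can be shown with a short Python sketch. The mode labels
<tt>"numeric"</tt>, <tt>"alpha"</tt> and <tt>"file"</tt> and the function <tt>sort_names</tt> are
hypothetical, and the handling of names without digits is an assumption, not SyMAP's documented behavior.

```python
import re

def sort_names(names, grp_sort):
    """Order sequence names by the chosen mode; 'file' keeps the input order."""
    if grp_sort == "alpha":
        return sorted(names)
    if grp_sort == "numeric":
        def key(name):
            m = re.search(r"\d+", name)
            # Assumption: names without digits sort after numbered ones, alphabetically.
            return (0, int(m.group()), name) if m else (1, 0, name)
        return sorted(names, key=key)
    return list(names)  # 'file': keep the order in which the names were read
```

<p>
For the names <tt>10</tt>, <tt>2</tt>, <tt>X</tt>, <tt>1</tt>, the numeric mode gives 1, 2, 10, X, while the
alphabetical mode gives 1, 10, 2, X.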
<p>
The other parameters are described in the following table. The last column (Re-run) tells
what processing (if any) must be re-done in order for the parameter change to take effect.
The codes in this column are as follows:
<ul>
<li>"-" : takes effect immediately
<li> "S" : re-run synteny computation
<li> "LA" : reload annotation, followed by synteny re-run
<li> "A" : re-run both alignment (MUMmer) and synteny 
<li> "L" : reload project, re-run all alignments
</ul>

<table border='1' cellpadding='5' rules=all >
<tr>
	<td align="left">
		<b>Parameter</b>
	</td>
	<td align="left" >
		<b>Description</b>
	</td>
	<td align="left">
		<b>Default Value</b>
	</td>
	<td align="left">
		<b>Re-run</b>
	</td>
</tr>
<tr>
	<td align="left" valign="top">
		category
	</td>
	<td align="left"  valign="top" width="400" >
		Category label for the project, currently only used in the 
		Summary list of the Project Manager window.
	</td>
	<td align="left"  valign="top" >
		Uncategorized
	</td>
	<td align="center" valign="top">
		-
	</td>
</tr>
<tr>
	<td align="left" valign="top">
		display_name
	</td>
	<td align="left"  valign="top" width="400" >
		A reader-friendly name for the project. It can contain any characters 
		and multiple words, but shorter names work better in the displays.
	</td>
	<td align="left"  valign="top" >
		defaults to the project database name
	</td>
	<td align="center" valign="top">
		-
	</td>
</tr>
<tr>
	<td align="left" valign="top">
		grp_type
	</td>
	<td align="left"  valign="top" width="400" >
		How to refer to the individual sequences, e.g. as "Chromosome", "LG", or "Contig".
	</td>
	<td align="left"  valign="top" >
		Chromosome
	</td>
	<td align="center" valign="top">
		-
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		order_against
	</td>
	<td align="left"  valign="top" width="400" >
		For draft contig sets, this allows you to order them using synteny to 
		one of the other projects. 
		The ordering occurs automatically after the synteny alignment is run, and a file showing 
		the ordering is printed into the project folder. A new "Anchored" project is also created,
		in which the draft contigs are concatenated into pseudomolecules based on the synteny.
	</td>
	<td align="left"  valign="top" >
		None
	</td>
	<td align="center" valign="top">
		S
	</td>
</tr>
<tr>
	<td align="left" valign="top">
		mask_all_but_genes
	</td>
	<td align="left"  valign="top" width="400" >
		Mask out all non-genic parts of the sequences before running MUMmer (gene annotation must
		be provided). Can save time but prevents non-annotated anchors from being found. 
	</td>
	<td align="left"  valign="top" >
		no
	</td>
	<td align="center" valign="top">
		A
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		min_size
	</td>
	<td align="left"  valign="top" width="400" >
		Minimum size of sequence to load, in bp; smaller sequences will be ignored. Note that annotations
		for ignored sequences will also be ignored, but some error messages will print.
	</td>
	<td align="left"  valign="top" >
		100000
	</td>
	<td align="center" valign="top">
		L
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		min_display_size_bp
	</td>
	<td align="left"  valign="top" width="400" >
		For the dotplot display, in case some of your sequences are much shorter than others. Set this value to a 
		minimum size in base pairs, and all sequences will be scaled to appear at least this large
		in the dotplot.
	</td>
	<td align="left"  valign="top" >
		0
	</td>
	<td align="center" valign="top">
		-
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		description
	</td>
	<td align="left"  valign="top" width="400" >
		Description of the project. Currently, this is only shown in the Project Manager Summary 
		section.
	</td>
	<td align="left"  valign="top" >
		
	</td>
	<td align="center" valign="top">
		-
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		annot_keywords
	</td>
	<td align="left"  valign="top" width="400" >
		SyMAP parses keyword/value pairs from the attributes field (last column) of the annotation
		gff files. Provide keywords here (separated by commas) and only these keywords will
		be parsed.  
	</td>
	<td align="left"  valign="top" >
		
	</td>
	<td align="center" valign="top">
		LA
	</td>	
</tr>
<tr>
	<td align="left" valign="top">
		annot_kw_mincount
	</td>
	<td align="left"  valign="top" width="400" >
		Set this number to exclude spurious attribute keywords.
		If <tt>annot_keywords</tt> is empty, then SyMAP will parse all key/value attribute pairs,
		as long as they have at least this number of occurrences in the gff file. 
	</td>
	<td align="left"  valign="top" >
		0
	</td>
	<td align="center" valign="top">
		LA
	</td>	
</tr>
</table>
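<p>
The interaction between <tt>annot_keywords</tt> and <tt>annot_kw_mincount</tt> can be sketched in Python as
follows. The functions <tt>parse_attributes</tt> and <tt>keywords_to_keep</tt> are hypothetical, and the
sketch assumes GFF3-style <tt>key=value</tt> pairs separated by semicolons; it is an illustration, not
SyMAP's implementation.

```python
from collections import Counter

def parse_attributes(attr_field):
    """Split a GFF3 attributes column into (key, value) pairs."""
    pairs = []
    for item in attr_field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            pairs.append((key.strip(), value.strip()))
    return pairs

def keywords_to_keep(attr_fields, annot_keywords, annot_kw_mincount):
    """Return the set of attribute keywords that would be parsed."""
    if annot_keywords:
        # An explicit comma-separated list takes precedence over counting.
        return {k.strip() for k in annot_keywords.split(",") if k.strip()}
    counts = Counter(
        key for field in attr_fields for key, _ in parse_attributes(field)
    )
    # Otherwise keep every keyword seen at least annot_kw_mincount times.
    return {k for k, n in counts.items() if n >= annot_kw_mincount}
```

<p>
With the default <tt>annot_kw_mincount</tt> of 0 and an empty <tt>annot_keywords</tt>, every keyword found
in the file is kept.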


</div>
</body>
</html>